Flexible Analog Search with Kernel PCA Embedded Molecule Vectors

Authors

  • Stefano Rensi
  • Russ B. Altman
Abstract

Studying analog series to find structural transformations that enhance the activity and ADME properties of lead compounds is an important part of drug development. Matched molecular pair (MMP) search is a powerful tool for analog analysis that imitates researchers' ability to select pairs of compounds that differ only by small, well-defined transformations. Abstraction is a challenge for existing MMP search algorithms, which can result in the omission of relevant, inexact MMPs and the inclusion of irrelevant, contextually dissimilar MMPs. In this work, we present a new method for MMP search that returns approximate results and enables flexible control over the abstraction of contextual information. We illustrate the concepts and mechanics of our method with a series of exemplar MMP queries, and then benchmark search accuracy using MMPs found by fragment indexing. We show that we can search for MMPs in a context-dependent manner, and accurately approximate context-independent, fragment-index-based MMP search over a range of fingerprint and dataset conditions. Our method can be used to search for pairwise correspondences among analog sets and to bolster MMP datasets where data is missing or incomplete.
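
The abstract does not spell out the computation, so the following is only a rough sketch of the general idea suggested by the title, not the authors' implementation. It assumes binary fingerprints, a Tanimoto kernel, and a ranking of candidate pairs by how closely their embedded difference vector matches that of the query pair. The toy fingerprints and all function names (tanimoto_kernel, kernel_pca_embed, nearest_pairs) are hypothetical.

```python
import numpy as np

def tanimoto_kernel(A, B):
    """Tanimoto similarity between the rows of two binary matrices."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    inter = A @ B.T
    union = A.sum(axis=1)[:, None] + B.sum(axis=1)[None, :] - inter
    safe_union = np.where(union > 0, union, 1.0)
    # Two all-zero fingerprints are treated as identical (similarity 1).
    return np.where(union > 0, inter / safe_union, 1.0)

def kernel_pca_embed(K, n_components):
    """Embed training points from a precomputed kernel matrix K."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    Kc = J @ K @ J                           # center in feature space
    vals, vecs = np.linalg.eigh(Kc)          # ascending eigenvalues
    order = np.argsort(vals)[::-1][:n_components]
    vals = np.clip(vals[order], 0.0, None)   # guard against tiny negatives
    return vecs[:, order] * np.sqrt(vals)

def nearest_pairs(X, query, k=3):
    """Rank ordered pairs (i, j) by the distance between X[j] - X[i]
    and the query pair's difference vector (assumed search criterion)."""
    qi, qj = query
    target = X[qj] - X[qi]
    scored = []
    for i in range(X.shape[0]):
        for j in range(X.shape[0]):
            if i == j or (i, j) == (qi, qj):
                continue
            dist = np.linalg.norm((X[j] - X[i]) - target)
            scored.append((dist, i, j))
    scored.sort()
    return [(i, j) for _, i, j in scored[:k]]

# Hypothetical toy fingerprints: bits 0-1 mark scaffold A, bits 2-3 mark
# scaffold B, bit 4 marks substituent X, bit 5 marks substituent Y.
fps = np.array([
    [1, 1, 0, 0, 1, 0],   # 0: scaffold A + X
    [1, 1, 0, 0, 0, 1],   # 1: scaffold A + Y
    [0, 0, 1, 1, 1, 0],   # 2: scaffold B + X
    [0, 0, 1, 1, 0, 1],   # 3: scaffold B + Y
])
K = tanimoto_kernel(fps, fps)
X = kernel_pca_embed(K, n_components=3)
hits = nearest_pairs(X, query=(0, 1), k=3)
print(hits)
```

On this toy set, querying with the X-to-Y change on scaffold A ranks the same change on scaffold B, the pair (2, 3), first. Keeping fewer components or choosing a different kernel would change how much of the surrounding context influences the ranking, which is one way to read the "flexible control over abstraction" described above.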

Similar resources

Recognizing Faces using Kernel Eigenfaces and Support Vector Machines

In face recognition, Principal Component Analysis (PCA) is often used to extract a low dimensional face representation based on the eigenvector of the face image autocorrelation matrix. Kernel Principal Component Analysis (Kernel PCA) has recently been proposed as a non-linear extension of PCA. While PCA is able to discover and represent linearly embedded manifolds, Kernel PCA can extract low d...

Support Vector Machine Approximation using Kernel PCA

Support Vector Machine is a very important technique used for classification and regression. Although very accurate, the speed of SVM classification decreases with increase in the number of support vectors. This paper describes one method of reducing the number of support vectors through the application of Kernel PCA. This method is different from other proposed methods as we show that the exac...

Massively Parallel Mixed-Signal VLSI Kernel Machines

Recently it has been shown that a simple learning paradigm, the support vector machine (SVM), outperforms some of the most elaborately tuned expert systems and neural networks in object recognition tasks. At run time, the SVM operates by computing a kernel-based distance between the object's vector at the input and a set of support vectors selected from the training set, and weighting the result...

Speedup of kernel eigenvoice speaker adaptation by embedded kernel PCA

Recently, we proposed an improvement to the eigenvoice (EV) speaker adaptation called kernel eigenvoice (KEV) speaker adaptation. In KEV adaptation, eigenvoices are computed using kernel PCA, and a new speaker’s adapted model is implicitly computed in the kernel-induced feature space. Due to many online kernel evaluations, both adaptation and subsequent recognition of KEV adaptation are slower ...

A modified concept of PCA to reduce the classification error using kernel SVM classifier

This paper focuses on the mathematical technique PCA and its drawback of mixing data pixels. We extract the principal directions of the covariance ellipse as in PCA, but we do not blindly take the eigenvectors corresponding to the k largest eigenvalues. Instead, we transform the data vectors into the new n-dimensional (n is the dimension of the old input space) vector space spanned by the Eig...

Journal title:

Volume 15, Issue

Pages -

Publication date: 2017